Convolutional Pose Machines: A Deep Architecture for Estimating Articulated Poses
نویسندگان
چکیده
Pose Machines provide a sequential prediction framework for learning rich implicit spatial models. In this work we show a systematic design for how convolutional networks can be incorporated into the pose machine framework for learning image features and image-dependent spatial models for the task of pose estimation. The contribution of this paper is to implicitly model long-range dependencies between variables in structured prediction tasks such as articulated pose estimation. We achieve this by designing a sequential architecture composed of convolutional networks that directly operate on belief maps from previous stages, producing increasingly refined estimates for part locations, without the need for explicit graphical model-style inference. Our approach addresses the characteristic difficulty of vanishing gradients during training by providing a natural learning objective function that enforces intermediate supervision, thereby replenishing back-propagated gradients and conditioning the learning procedure. We demonstrate state-of-the-art performance and outperform competing methods on standard benchmarks including the MPII, LSP, and FLIC datasets.
منابع مشابه
MoDeep: A Deep Learning Framework Using Motion Features for Human Pose Estimation
In this work, we propose a novel and efficient method for articulated human pose estimation in videos using a convolutional network architecture, which incorporates both color and motion features. We propose a new human body pose dataset, FLIC-motion, that extends the FLIC dataset [1] with additional motion features. We apply our architecture to this dataset and report significantly better perf...
متن کاملReal-Time Biologically Inspired Action Recognition from Key Poses Using a Neuromorphic Architecture
Intelligent agents, such as robots, have to serve a multitude of autonomous functions. Examples are, e.g., collision avoidance, navigation and route planning, active sensing of its environment, or the interaction and non-verbal communication with people in the extended reach space. Here, we focus on the recognition of the action of a human agent based on a biologically inspired visual architect...
متن کاملLearning Markerless Human Pose Estimation from Multiple Viewpoint Video
We present a novel human performance capture technique capable of robustly estimating the pose (articulated joint positions) of a performer observed passively via multiple view-point video (MVV). An affine invariant pose descriptor is learned using a convolutional neural network (CNN) trained over volumetric data extracted from a MVV dataset of diverse human pose and appearance. A manifold embe...
متن کاملArticulated Pose Estimation by a Graphical Model with Image Dependent Pairwise Relations
We present a method for estimating articulated human pose from a single static image based on a graphical model with novel pairwise relations that make adaptive use of local image measurements. More precisely, we specify a graphical model for human pose which exploits the fact the local image measurements can be used both to detect parts (or joints) and also to predict the spatial relationships...
متن کاملJoint Training of a Convolutional Network and a Graphical Model for Human Pose Estimation
This paper proposes a new hybrid architecture that consists of a deep Convolutional Network and a Markov Random Field. We show how this architecture is successfully applied to the challenging problem of articulated human pose estimation in monocular images. The architecture can exploit structural domain constraints such as geometric relationships between body joint locations. We show that joint...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2016